Overview

Dataset statistics

Number of variables: 29
Number of observations: 9480
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 10
Duplicate rows (%): 0.1%
Total size in memory: 2.6 MiB
Average record size in memory: 286.0 B
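
The derived figures in this block can be recomputed from the raw counts. The sketch below is a minimal check, assuming that percentages are rounded to one decimal place and that the total size is the average record size multiplied by the number of observations. Both assumptions are inferred from the numbers shown here, not taken from the tool's documentation.

```python
# Values copied from the Dataset statistics block.
n_observations = 9480
n_duplicate_rows = 10
avg_record_bytes = 286.0

# Share of duplicate rows, as a percentage of all rows.
duplicate_pct = 100 * n_duplicate_rows / n_observations

# Assumed relation: total size = average record size * number of rows.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Both printed values agree with the report after rounding.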

Variable types

Text: 1
DateTime: 5
Numeric: 8
Categorical: 14
Boolean: 1

Alerts

Dataset has 10 (0.1%) duplicate rows (Duplicates)
brand_encoded is highly overall correlated with helpful_missing_flag and 2 other fields (High correlation)
category_group_encoded is highly overall correlated with purchase_missing_flag (High correlation)
fake_review_label is highly overall correlated with username_dup_flag (High correlation)
helpful_missing_flag is highly overall correlated with brand_encoded and 1 other fields (High correlation)
is_short is highly overall correlated with repetition_score (High correlation)
log_helpful is highly overall correlated with no_helpful_votes_flag and 1 other fields (High correlation)
multi_review_same_product_flag is highly overall correlated with username_dup_flag (High correlation)
no_helpful_votes_flag is highly overall correlated with log_helpful (High correlation)
product_name_match_flag is highly overall correlated with semantic_mismatch_score (High correlation)
purchase_encoded is highly overall correlated with purchase_missing_flag (High correlation)
purchase_missing_flag is highly overall correlated with brand_encoded and 2 other fields (High correlation)
recommend_encoded is highly overall correlated with recommend_missing_flag and 1 other fields (High correlation)
recommend_missing_flag is highly overall correlated with recommend_encoded (High correlation)
repetition_score is highly overall correlated with is_short and 2 other fields (High correlation)
review_length is highly overall correlated with repetition_score and 1 other fields (High correlation)
reviews.numHelpful is highly overall correlated with log_helpful (High correlation)
reviews.rating is highly overall correlated with recommend_encoded (High correlation)
semantic_mismatch_score is highly overall correlated with helpful_missing_flag and 2 other fields (High correlation)
text_length is highly overall correlated with repetition_score and 1 other fields (High correlation)
unrelated_product_flag is highly overall correlated with brand_encoded and 1 other fields (High correlation)
username_dup_flag is highly overall correlated with fake_review_label and 1 other fields (High correlation)
recommend_missing_flag is highly imbalanced (80.3%) (Imbalance)
recommend_encoded is highly imbalanced (71.5%) (Imbalance)
no_helpful_votes_flag is highly imbalanced (73.1%) (Imbalance)
is_short is highly imbalanced (75.0%) (Imbalance)
username_dup_flag is highly imbalanced (58.7%) (Imbalance)
multi_review_same_day_flag is highly imbalanced (96.6%) (Imbalance)
multi_review_same_product_flag is highly imbalanced (79.2%) (Imbalance)
fake_review_label is highly imbalanced (62.8%) (Imbalance)
reviews.numHelpful is highly skewed (γ1 = 31.6344377) (Skewed)
reviews.numHelpful has 9045 (95.4%) zeros (Zeros)
log_helpful has 9045 (95.4%) zeros (Zeros)
sentiment_polarity has 562 (5.9%) zeros (Zeros)
category_group_encoded has 375 (4.0%) zeros (Zeros)
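
The imbalance percentages in the alerts are consistent with one minus the normalized Shannon entropy of the class counts. The sketch below is an illustration under that assumption: the function name `imbalance_score` is hypothetical, and the formula is inferred from the counts in this report rather than quoted from the profiling tool.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class counts."""
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    # H_max is the entropy of a uniform distribution over the same classes.
    return 1 - entropy / math.log2(len(counts))


# Class counts taken from the variable sections of this report.
examples = {
    "recommend_missing_flag": [9190, 290],
    "recommend_encoded": [8782, 408, 290],
    "is_short": [9085, 395],
    "username_dup_flag": [8693, 787],
}
for name, counts in examples.items():
    print(f"{name}: {100 * imbalance_score(counts):.1f}%")
```

A score of 0% corresponds to perfectly balanced classes, and a score near 100% means almost all rows share one value.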

Reproduction

Analysis started: 2025-10-02 19:04:51.905999
Analysis finished: 2025-10-02 19:05:04.421797
Duration: 12.52 seconds
Software version: ydata-profiling v4.17.0
Download configuration: config.json
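
The reported duration is the difference between the two timestamps. A short check using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2025-10-02 19:04:51.905999")
finished = datetime.fromisoformat("2025-10-02 19:05:04.421797")

duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```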

Variables

id
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 638.9 KiB

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 189600
Distinct characters: 64
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18
Unique (%): 0.2%

Sample

1st row: AV13O1A8GV-KLJ3akUyj
2nd row: AV14LG0R-jtxr-f38QfS
3rd row: AV14LG0R-jtxr-f38QfS
4th row: AV16khLE-jtxr-f38VFn
5th row: AV16khLE-jtxr-f38VFn

Value                  Count  Frequency (%)
avpf3vofilapnd_xjpun   3264   34.4%
avpf0eb2ljejml43evst   847    8.9%
avpe41tqilapnd_xqh3d   757    8.0%
avpf2tw1ilapnd_xjflc   669    7.1%
avpe59io1cnluz0-zgdu   668    7.0%
av1l8zrzvkc47qavhnav   644    6.8%
avpe8gsiljejml43y6ed   367    3.9%
av1ygdqsgv-klj3adc-o   333    3.5%
avpe31o71cnluz0-yrsd   245    2.6%
avpe9w4d1cnluz0-avf0   213    2.2%
Other values (67)      1473   15.5%
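
The "Other values" row is what remains after the ten most frequent values are listed. A minimal sketch of that bookkeeping, using the counts from the table and assuming percentages are rounded to one decimal place:

```python
# Figures copied from the id section of the report.
n_observations = 9480
n_distinct = 77
top_counts = [3264, 847, 757, 669, 668, 644, 367, 333, 245, 213]

# Rows and distinct values not covered by the ten listed values.
other_count = n_observations - sum(top_counts)
other_distinct = n_distinct - len(top_counts)


def pct(count):
    """Frequency of a count as a percentage of all observations."""
    return 100 * count / n_observations


print(f"Top value: {pct(top_counts[0]):.1f}%")
print(f"Other values ({other_distinct}): {other_count} ({pct(other_count):.1f}%)")
```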

Most occurring characters

Value              Count   Frequency (%)
A                  15869   8.4%
V                  14790   7.8%
p                  11423   6.0%
n                  10494   5.5%
f                  9667    5.1%
l                  8039    4.2%
3                  6743    3.6%
D                  6394    3.4%
i                  5841    3.1%
e                  5537    2.9%
Other values (54)  94803   50.0%

Most occurring categories, scripts and blocks

All 189600 characters fall into a single Unicode category, a single script and a single block, each reported as (unknown) with 100.0% frequency. The tables of the most frequent character per category, per script and per block are therefore identical to the table under "Most occurring characters".

DateTime

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 74.2 KiB
Minimum: 2014-02-18 02:01:47+00:00
Maximum: 2017-07-26 23:26:15+00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

DateTime

Distinct: 69
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 74.2 KiB
Minimum: 2018-01-30 06:08:52+00:00
Maximum: 2018-02-05 11:30:15+00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

DateTime

Distinct: 4041
Distinct (%): 42.6%
Missing: 0
Missing (%): 0.0%
Memory size: 74.2 KiB
Minimum: 2007-08-07 00:00:00+00:00
Maximum: 2018-01-10 19:56:37+00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

DateTime

Distinct: 985
Distinct (%): 10.4%
Missing: 0
Missing (%): 0.0%
Memory size: 74.2 KiB
Minimum: 2017-03-10 06:55:39+00:00
Maximum: 2018-02-05 10:21:42+00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

DateTime

Distinct: 342
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 74.2 KiB
Minimum: 2017-03-09 02:48:00+00:00
Maximum: 2018-01-26 05:42:00+00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

reviews.numHelpful
Real number (ℝ)

High correlation  Skewed  Zeros 

Distinct: 38
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.20875527
Minimum: 0
Maximum: 141
Zeros: 9045
Zeros (%): 95.4%
Negative: 0
Negative (%): 0.0%
Memory size: 74.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 141
Range: 141
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 2.5020695
Coefficient of variation (CV): 11.985659
Kurtosis: 1397.2197
Mean: 0.20875527
Median Absolute Deviation (MAD): 0
Skewness: 31.634438
Sum: 1979
Variance: 6.2603517
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=38)

Common Values

Value              Count  Frequency (%)
0                  9045   95.4%
1                  232    2.4%
2                  54     0.6%
3                  43     0.5%
4                  24     0.3%
6                  13     0.1%
7                  10     0.1%
5                  9      0.1%
8                  5      0.1%
12                 4      < 0.1%
Other values (28)  41     0.4%

Minimum 10 values

Value  Count  Frequency (%)
0      9045   95.4%
1      232    2.4%
2      54     0.6%
3      43     0.5%
4      24     0.3%
5      9      0.1%
6      13     0.1%
7      10     0.1%
8      5      0.1%
9      4      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
141    1      < 0.1%
96     1      < 0.1%
56     1      < 0.1%
52     1      < 0.1%
47     1      < 0.1%
46     1      < 0.1%
45     1      < 0.1%
41     1      < 0.1%
39     1      < 0.1%
35     1      < 0.1%
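
Several of the statistics reported for reviews.numHelpful are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A small sketch that recomputes them from the reported figures:

```python
# Figures copied from the reviews.numHelpful section.
n_observations = 9480
total = 1979          # Sum
std = 2.5020695       # Standard deviation
zeros = 9045          # Rows with zero helpful votes

mean = total / n_observations
variance = std ** 2
cv = std / mean
zeros_pct = 100 * zeros / n_observations

print(f"Mean: {mean:.8f}")
print(f"Variance: {variance:.7f}")
print(f"Coefficient of variation (CV): {cv:.6f}")
print(f"Zeros (%): {zeros_pct:.1f}%")
```

With 95.4% of rows at zero and a maximum of 141, a CV close to 12 is expected: the spread is driven by a handful of large values.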

reviews.rating
Categorical

High correlation 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 463.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 9480
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5
2nd row: 5
3rd row: 5
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
5      6001   63.3%
4      2697   28.4%
3      425    4.5%
1      227    2.4%
2      130    1.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

Most occurring characters

Every value is a single character, so the character counts are identical to the Common Values table. All 9480 characters fall into one Unicode category, script and block, each reported as (unknown) with 100.0% frequency.

purchase_missing_flag
Categorical

High correlation 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 463.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 9480
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
1      6033   63.6%
0      3447   36.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

Most occurring characters

Every value is a single character, so the character counts are identical to the Common Values table. All 9480 characters fall into one Unicode category, script and block, each reported as (unknown) with 100.0% frequency.

purchase_encoded
Categorical

High correlation 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 463.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 9480
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 2
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
0      6033   63.6%
1      2959   31.2%
2      488    5.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

Most occurring characters

Every value is a single character, so the character counts are identical to the Common Values table. All 9480 characters fall into one Unicode category, script and block, each reported as (unknown) with 100.0% frequency.

recommend_missing_flag
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 463.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 9480
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      9190   96.9%
1      290    3.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

Most occurring characters

Every value is a single character, so the character counts are identical to the Common Values table. All 9480 characters fall into one Unicode category, script and block, each reported as (unknown) with 100.0% frequency.

recommend_encoded
Categorical

High correlation  Imbalance 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 463.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 9480
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
2      8782   92.6%
1      408    4.3%
0      290    3.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

Most occurring characters

Every value is a single character, so the character counts are identical to the Common Values table. All 9480 characters fall into one Unicode category, script and block, each reported as (unknown) with 100.0% frequency.
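
The high correlation between recommend_encoded and recommend_missing_flag is unsurprising if the flag is derived from the encoded column. The sketch below tests one hypothesis against the sample rows and counts printed in this report: that code 0 in recommend_encoded marks a missing value. That mapping is an assumption inferred from the matching counts (290 in both columns), not something the report states.

```python
# Sample rows (1st to 5th) copied from the two variable sections.
encoded_sample = [0, 0, 0, 1, 1]   # recommend_encoded
flag_sample = [1, 1, 1, 0, 0]      # recommend_missing_flag

# Hypothetical rule: the flag is 1 exactly when the encoded value is 0.
derived_flag = [1 if code == 0 else 0 for code in encoded_sample]

# Value counts copied from the Common Values tables.
encoded_counts = {2: 8782, 1: 408, 0: 290}
flag_counts = {0: 9190, 1: 290}

print("Sample rows consistent:", derived_flag == flag_sample)
print("Counts consistent:", encoded_counts[0] == flag_counts[1])
```

The same pattern appears for purchase_encoded and purchase_missing_flag, where the count 6033 is shared by code 0 and flag value 1.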

helpful_missing_flag
Categorical

High correlation 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 463.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 9480
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      5260   55.5%
0      4220   44.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

Most occurring characters

Every value is a single character, so the character counts are identical to the Common Values table. All 9480 characters fall into one Unicode category, script and block, each reported as (unknown) with 100.0% frequency.

no_helpful_votes_flag
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 463.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 9480
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      9045   95.4%
0      435    4.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

Most occurring characters

Every value is a single character, so the character counts are identical to the Common Values table. All 9480 characters fall into one Unicode category, script and block, each reported as (unknown) with 100.0% frequency.

log_helpful
Real number (ℝ)

High correlation  Zeros 

Distinct: 38
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.055888228
Minimum: 0
Maximum: 4.9558271
Zeros: 9045
Zeros (%): 95.4%
Negative: 0
Negative (%): 0.0%
Memory size: 74.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 4.9558271
Range: 4.9558271
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.30461053
Coefficient of variation (CV): 5.4503523
Kurtosis: 67.163843
Mean: 0.055888228
Median Absolute Deviation (MAD): 0
Skewness: 7.3957479
Sum: 529.8204
Variance: 0.092787576
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=38)

Common Values

Value              Count  Frequency (%)
0                  9045   95.4%
0.6931471806       232    2.4%
1.098612289        54     0.6%
1.386294361        43     0.5%
1.609437912        24     0.3%
1.945910149        13     0.1%
2.079441542        10     0.1%
1.791759469        9      0.1%
2.197224577        5      0.1%
2.564949357        4      < 0.1%
Other values (28)  41     0.4%

Minimum 10 values

Value         Count  Frequency (%)
0             9045   95.4%
0.6931471806  232    2.4%
1.098612289   54     0.6%
1.386294361   43     0.5%
1.609437912   24     0.3%
1.791759469   9      0.1%
1.945910149   13     0.1%
2.079441542   10     0.1%
2.197224577   5      0.1%
2.302585093   4      < 0.1%

Maximum 10 values

Value        Count  Frequency (%)
4.955827058  1      < 0.1%
4.574710979  1      < 0.1%
4.043051268  1      < 0.1%
3.970291914  1      < 0.1%
3.871201011  1      < 0.1%
3.850147602  1      < 0.1%
3.828641396  1      < 0.1%
3.737669618  1      < 0.1%
3.688879454  1      < 0.1%
3.583518938  1      < 0.1%
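
The values of log_helpful line up with the natural logarithm of one plus reviews.numHelpful: 0.6931471806 is ln(2), and the maximum 4.955827058 is ln(142), matching the numHelpful maximum of 141. The sketch below checks that relation. The helper name `log_helpful` is illustrative, and the transform itself is inferred from the values rather than documented in the report.

```python
import math


def log_helpful(num_helpful):
    """Assumed transform: natural log of (1 + helpful votes)."""
    return math.log1p(num_helpful)


# A few numHelpful values from the reviews.numHelpful tables.
for votes in [0, 1, 2, 3, 4, 141]:
    print(votes, round(log_helpful(votes), 9))
```

Because log1p(0) is 0, the 9045 zero rows stay at zero after the transform, which is why both columns raise the same Zeros alert.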

text_length
Real number (ℝ)

High correlation 

Distinct: 126
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.577848
Minimum: 0
Maximum: 553
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 74.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 8
median: 12
Q3: 18
95-th percentile: 39
Maximum: 553
Range: 553
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 15.110852
Coefficient of variation (CV): 0.9700218
Kurtosis: 192.90745
Mean: 15.577848
Median Absolute Deviation (MAD): 4
Skewness: 8.2539535
Sum: 147678
Variance: 228.33786
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
10                  720    7.6%
9                   664    7.0%
8                   647    6.8%
7                   630    6.6%
6                   619    6.5%
11                  595    6.3%
12                  557    5.9%
13                  451    4.8%
14                  420    4.4%
5                   384    4.1%
Other values (116)  3793   40.0%

Minimum 10 values

Value  Count  Frequency (%)
0      1      < 0.1%
1      36     0.4%
2      74     0.8%
3      78     0.8%
4      206    2.2%
5      384    4.1%
6      619    6.5%
7      630    6.6%
8      647    6.8%
9      664    7.0%

Maximum 10 values

Value  Count  Frequency (%)
553    1      < 0.1%
229    1      < 0.1%
210    1      < 0.1%
199    1      < 0.1%
189    1      < 0.1%
180    1      < 0.1%
170    1      < 0.1%
164    1      < 0.1%
160    1      < 0.1%
159    1      < 0.1%

is_short
Boolean

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.4 KiB

Value  Count  Frequency (%)
False  9085   95.8%
True   395    4.2%
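
The report does not say how is_short is defined, but the count of True values (395) can be compared against the smallest text_length values listed earlier. The sketch below searches for a cutoff that reproduces that count. It assumes is_short is a simple "text_length below a threshold" rule, which is a hypothesis and not something the report confirms.

```python
# Counts for the smallest text_length values, from the text_length section.
length_counts = {0: 1, 1: 36, 2: 74, 3: 78, 4: 206, 5: 384}
is_short_true = 395  # Rows where is_short is True


def count_below(threshold):
    """Number of rows whose text_length is strictly below the threshold."""
    return sum(c for length, c in length_counts.items() if length < threshold)


# Only thresholds up to 6 can be tested with the counts available here.
candidates = [t for t in range(1, 7) if count_below(t) == is_short_true]
print("Thresholds matching 395 rows:", candidates)
```

A single match at 5 would mean the counts are consistent with "text_length < 5", though a matching count alone does not prove the rule row by row.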

sentiment_polarity
Real number (ℝ)

Zeros 

Distinct: 2161
Distinct (%): 22.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.34275005
Minimum: -1
Maximum: 1
Zeros: 562
Zeros (%): 5.9%
Negative: 607
Negative (%): 6.4%
Memory size: 74.2 KiB

Quantile statistics

Minimum: -1
5-th percentile: -0.05
Q1: 0.175
median: 0.36190476
Q3: 0.5
95-th percentile: 0.8
Maximum: 1
Range: 2
Interquartile range (IQR): 0.325

Descriptive statistics

Standard deviation: 0.26629874
Coefficient of variation (CV): 0.77694733
Kurtosis: 1.2366286
Mean: 0.34275005
Median Absolute Deviation (MAD): 0.15791667
Skewness: -0.35748081
Sum: 3249.2705
Variance: 0.070915017
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value                 Count  Frequency (%)
0.5                   619    6.5%
0                     562    5.9%
0.8                   331    3.5%
0.4333333333          216    2.3%
0.25                  211    2.2%
0.4                   197    2.1%
0.7                   173    1.8%
0.6                   147    1.6%
1                     142    1.5%
0.3666666667          130    1.4%
Other values (2151)   6752   71.2%

Minimum 10 values

Value          Count  Frequency (%)
-1             11     0.1%
-0.85          1      < 0.1%
-0.8166666667  1      < 0.1%
-0.8           3      < 0.1%
-0.75          3      < 0.1%
-0.7142857143  3      < 0.1%
-0.7           1      < 0.1%
-0.7           8      0.1%
-0.6666666667  1      < 0.1%
-0.65          1      < 0.1%

Maximum 10 values

Value         Count  Frequency (%)
1             142    1.5%
0.9333333333  3      < 0.1%
0.9           42     0.4%
0.875         1      < 0.1%
0.8666666667  7      0.1%
0.86          1      < 0.1%
0.85          1      < 0.1%
0.85          20     0.2%
0.8333333333  13     0.1%
0.825         3      < 0.1%

review_length
Real number (ℝ)

High correlation 

Distinct: 126
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.577848
Minimum: 0
Maximum: 553
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 74.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 8
median: 12
Q3: 18
95-th percentile: 39
Maximum: 553
Range: 553
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 15.110852
Coefficient of variation (CV): 0.9700218
Kurtosis: 192.90745
Mean: 15.577848
Median Absolute Deviation (MAD): 4
Skewness: 8.2539535
Sum: 147678
Variance: 228.33786
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
10                  720    7.6%
9                   664    7.0%
8                   647    6.8%
7                   630    6.6%
6                   619    6.5%
11                  595    6.3%
12                  557    5.9%
13                  451    4.8%
14                  420    4.4%
5                   384    4.1%
Other values (116)  3793   40.0%

Minimum 10 values

Value  Count  Frequency (%)
0      1      < 0.1%
1      36     0.4%
2      74     0.8%
3      78     0.8%
4      206    2.2%
5      384    4.1%
6      619    6.5%
7      630    6.6%
8      647    6.8%
9      664    7.0%

Maximum 10 values

Value  Count  Frequency (%)
553    1      < 0.1%
229    1      < 0.1%
210    1      < 0.1%
199    1      < 0.1%
189    1      < 0.1%
180    1      < 0.1%
170    1      < 0.1%
164    1      < 0.1%
160    1      < 0.1%
159    1      < 0.1%

username_dup_flag
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 463.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 9480
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0      8693   91.7%
1      787    8.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

Most occurring characters

Every value is a single character, so the character counts are identical to the Common Values table. All 9480 characters fall into one Unicode category, script and block, each reported as (unknown) with 100.0% frequency.

multi_review_same_day_flag
Categorical

Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 463.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 9480
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      9446   99.6%
1      34     0.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

Most occurring characters

Every value is a single character, so the character counts are identical to the Common Values table. All 9480 characters fall into one Unicode category, script and block, each reported as (unknown) with 100.0% frequency.

multi_review_same_product_flag
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 463.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 9480
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      9169   96.7%
1      311    3.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

Most occurring characters

Every value is a single character, so the character counts are identical to the Common Values table. All 9480 characters fall into one Unicode category, script and block, each reported as (unknown) with 100.0% frequency.

brand_encoded
Real number (ℝ)

High correlation 

Distinct: 71
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.518038
Minimum: 0
Maximum: 70
Zeros: 4
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 74.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 13
Q1: 13
median: 25
Q3: 52
95-th percentile: 67
Maximum: 70
Range: 70
Interquartile range (IQR): 39

Descriptive statistics

Standard deviation: 21.395622
Coefficient of variation (CV): 0.63833158
Kurtosis: -1.5115072
Mean: 33.518038
Median Absolute Deviation (MAD): 14
Skewness: 0.31008717
Sum: 317751
Variance: 457.77265
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value              Count  Frequency (%)
13                 3264   34.4%
52                 847    8.9%
17                 757    8.0%
55                 669    7.1%
64                 668    7.0%
41                 644    6.8%
67                 452    4.8%
44                 367    3.9%
69                 333    3.5%
24                 213    2.2%
Other values (61)  1266   13.4%

Minimum 10 values

Value  Count  Frequency (%)
0      4      < 0.1%
1      5      0.1%
2      86     0.9%
3      34     0.4%
4      73     0.8%
5      2      < 0.1%
6      9      0.1%
7      2      < 0.1%
8      41     0.4%
9      25     0.3%

Maximum 10 values

Value  Count  Frequency (%)
70     59     0.6%
69     333    3.5%
68     1      < 0.1%
67     452    4.8%
66     5      0.1%
65     1      < 0.1%
64     668    7.0%
63     1      < 0.1%
62     1      < 0.1%
61     2      < 0.1%

category_group_encoded
Real number (ℝ)

High correlation  Zeros 

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.7056962
Minimum: 0
Maximum: 8
Zeros: 375
Zeros (%): 4.0%
Negative: 0
Negative (%): 0.0%
Memory size: 74.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 7
median: 7
Q3: 7
95-th percentile: 8
Maximum: 8
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.5398115
Coefficient of variation (CV): 0.22962738
Kurtosis: 12.62331
Mean: 6.7056962
Median Absolute Deviation (MAD): 0
Skewness: -3.6588946
Sum: 63570
Variance: 2.3710193
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

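Common values

Value | Count | Frequency (%)
7 | 8034 | 84.7%
8 | 808 | 8.5%
0 | 375 | 4.0%
3 | 140 | 1.5%
5 | 54 | 0.6%
1 | 36 | 0.4%
6 | 18 | 0.2%
2 | 13 | 0.1%
4 | 2 | < 0.1%

Extreme values, smallest first

Value | Count | Frequency (%)
0 | 375 | 4.0%
1 | 36 | 0.4%
2 | 13 | 0.1%
3 | 140 | 1.5%
4 | 2 | < 0.1%
5 | 54 | 0.6%
6 | 18 | 0.2%
7 | 8034 | 84.7%
8 | 808 | 8.5%

Extreme values, largest first

Value | Count | Frequency (%)
8 | 808 | 8.5%
7 | 8034 | 84.7%
6 | 18 | 0.2%
5 | 54 | 0.6%
4 | 2 | < 0.1%
3 | 140 | 1.5%
2 | 13 | 0.1%
1 | 36 | 0.4%
0 | 375 | 4.0%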
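
The summary figures reported for category_group_encoded can be cross-checked from its value counts alone. The sketch below is a minimal standard-library check, not the profiler's own code. It assumes the reported variance uses the sample (n - 1) denominator, and the helper name value_at_position is hypothetical.

```python
import math

# Value counts copied from the category_group_encoded frequency table.
counts = {7: 8034, 8: 808, 0: 375, 3: 140, 5: 54, 1: 36, 6: 18, 2: 13, 4: 2}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
sum_sq_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
variance = sum_sq_dev / (n - 1)
std = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean


def value_at_position(counts, position):
    """Return the value at a 0-based position in the sorted data."""
    running = 0
    for value in sorted(counts):
        running += counts[value]
        if position < running:
            return value
    raise ValueError("position is outside the data")


# Upper middle element; with 84.7% of rows equal to 7, both middle
# elements are 7, so this is also the median.
median = value_at_position(counts, n // 2)

print(n, total, mean, variance, cv, median)
```

Because one value covers more than three quarters of the rows, Q1, the median, and Q3 all coincide, which is why the report shows an IQR and a MAD of 0 for this column.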

product_name_match_flag
Categorical

High correlation 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 463.0 KiB

Value | Count
0 | 5321
1 | 4159

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 9480
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
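
As a rough illustration of the character properties mentioned above, the sketch below counts characters by Unicode general category with Python's standard unicodedata module. It is not the profiler's implementation, and the function name category_counts is hypothetical. The input mirrors the first five values listed under Sample for this column. The standard library does not expose script or block lookups, so only the general category is shown.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters across all values by Unicode general category."""
    counter = Counter()
    for value in values:
        for char in value:
            counter[unicodedata.category(char)] += 1
    return dict(counter)


# A 0/1 flag stored as text contains only decimal digits (category "Nd").
sample = ["0", "0", "0", "0", "1"]
print(category_counts(sample))  # {'Nd': 5}
```

A single category for every character is consistent with the "Distinct categories: 1" figure reported for this column.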

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value | Count | Frequency (%)
0 | 5321 | 56.1%
1 | 4159 | 43.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 5321 | 56.1%
1 | 4159 | 43.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 5321 | 56.1%
1 | 4159 | 43.9%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 9480 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 5321 | 56.1%
1 | 4159 | 43.9%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 9480 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 5321 | 56.1%
1 | 4159 | 43.9%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 9480 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 5321 | 56.1%
1 | 4159 | 43.9%

unrelated_product_flag
Categorical

High correlation 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 463.0 KiB

Value | Count
0 | 6064
1 | 3416

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 9480
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 6064 | 64.0%
1 | 3416 | 36.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 6064 | 64.0%
1 | 3416 | 36.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 6064 | 64.0%
1 | 3416 | 36.0%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 9480 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 6064 | 64.0%
1 | 3416 | 36.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 9480 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 6064 | 64.0%
1 | 3416 | 36.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 9480 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 6064 | 64.0%
1 | 3416 | 36.0%

semantic_mismatch_score
Categorical

High correlation 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 463.0 KiB

Value | Count
1 | 4375
0 | 2924
2 | 2181

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 9480
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 2
5th row: 0

Common Values

Value | Count | Frequency (%)
1 | 4375 | 46.1%
0 | 2924 | 30.8%
2 | 2181 | 23.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 4375 | 46.1%
0 | 2924 | 30.8%
2 | 2181 | 23.0%

Most occurring characters

Value | Count | Frequency (%)
1 | 4375 | 46.1%
0 | 2924 | 30.8%
2 | 2181 | 23.0%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 9480 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
1 | 4375 | 46.1%
0 | 2924 | 30.8%
2 | 2181 | 23.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 9480 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
1 | 4375 | 46.1%
0 | 2924 | 30.8%
2 | 2181 | 23.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 9480 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
1 | 4375 | 46.1%
0 | 2924 | 30.8%
2 | 2181 | 23.0%

repetition_score
Real number (ℝ)

High correlation 

Distinct: 236
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.12999361
Minimum: 0
Maximum: 1
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 74.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.052631579
Q1: 0.08
Median: 0.11111111
Q3: 0.15384615
95-th percentile: 0.25
Maximum: 1
Range: 1
Interquartile range (IQR): 0.073846154

Descriptive statistics

Standard deviation: 0.090575472
Coefficient of variation (CV): 0.69676866
Kurtosis: 34.92107
Mean: 0.12999361
Median Absolute Deviation (MAD): 0.034188034
Skewness: 4.5749584
Sum: 1232.3394
Variance: 0.0082039161
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.1666666667 | 723 | 7.6%
0.125 | 716 | 7.6%
0.1 | 709 | 7.5%
0.1428571429 | 693 | 7.3%
0.1111111111 | 640 | 6.8%
0.09090909091 | 547 | 5.8%
0.2 | 515 | 5.4%
0.08333333333 | 476 | 5.0%
0.07692307692 | 364 | 3.8%
0.07142857143 | 329 | 3.5%
Other values (226) | 3768 | 39.7%

Extreme values, smallest first

Value | Count | Frequency (%)
0 | 1 | < 0.1%
0.02083333333 | 1 | < 0.1%
0.02272727273 | 1 | < 0.1%
0.02380952381 | 1 | < 0.1%
0.025 | 1 | < 0.1%
0.0253164557 | 1 | < 0.1%
0.02597402597 | 1 | < 0.1%
0.02631578947 | 2 | < 0.1%
0.02678571429 | 1 | < 0.1%
0.02712477396 | 1 | < 0.1%

Extreme values, largest first

Value | Count | Frequency (%)
1 | 37 | 0.4%
0.825 | 1 | < 0.1%
0.75 | 2 | < 0.1%
0.6666666667 | 1 | < 0.1%
0.6470588235 | 1 | < 0.1%
0.5 | 85 | 0.9%
0.4615384615 | 1 | < 0.1%
0.4285714286 | 5 | 0.1%
0.4 | 30 | 0.3%
0.375 | 6 | 0.1%
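
The caption "Histogram with fixed size bins (bins=50)" refers to equal-width binning over the observed range. The sketch below shows one common way to do that with the standard library only. It is an illustration under the assumption that the maximum is placed in the last bin, not a copy of the plotting code the report used, and the function name fixed_width_histogram is hypothetical.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width bins spanning [min, max]."""
    low, high = min(values), max(values)
    if high == low:
        # Degenerate case: every value is identical, so use a single bin.
        return [len(values)]
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        # Clamp the index so the maximum lands in the last bin.
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    return counts


# Toy input: three values, two bins over [0, 1].
print(fixed_width_histogram([0.0, 0.5, 1.0], 2))  # [1, 2]
```

For a column like repetition_score, where most values sit below 0.25 and a few reach 1, equal-width bins leave most of the upper range nearly empty, which matches the strong positive skewness reported above.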

fake_review_label
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 463.0 KiB

Value | Count
0 | 8800
1 | 680

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 9480
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 8800 | 92.8%
1 | 680 | 7.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 8800 | 92.8%
1 | 680 | 7.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 8800 | 92.8%
1 | 680 | 7.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 9480 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 8800 | 92.8%
1 | 680 | 7.2%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 9480 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 8800 | 92.8%
1 | 680 | 7.2%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 9480 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 8800 | 92.8%
1 | 680 | 7.2%
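
The Imbalance alert on fake_review_label follows directly from its two counts. The sketch below derives the class shares, the majority-to-minority ratio, and the accuracy of a trivial always-predict-majority baseline. The baseline is an added point of comparison, not something the report computes, and all variable names are hypothetical.

```python
# Counts copied from the fake_review_label Common Values table.
label_counts = {"0": 8800, "1": 680}

total = sum(label_counts.values())
shares = {label: count / total for label, count in label_counts.items()}

# Ratio of the most common class to the least common one.
imbalance_ratio = max(label_counts.values()) / min(label_counts.values())

# Accuracy of a trivial model that always predicts the majority class.
majority_baseline = max(shares.values())

print(shares, imbalance_ratio, majority_baseline)
```

With roughly 13 genuine reviews for every flagged one, plain accuracy is a weak yardstick for this label: the trivial baseline already matches the 92.8% majority share.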

Interactions

[Figure: 64 pairwise interaction plots]

Correlations

variable | brand_encoded | category_group_encoded | fake_review_label | helpful_missing_flag | is_short | log_helpful | multi_review_same_day_flag | multi_review_same_product_flag | no_helpful_votes_flag | product_name_match_flag | purchase_encoded | purchase_missing_flag | recommend_encoded | recommend_missing_flag | repetition_score | review_length | reviews.numHelpful | reviews.rating | semantic_mismatch_score | sentiment_polarity | text_length | unrelated_product_flag | username_dup_flag
brand_encoded | 1.000 | -0.149 | 0.218 | 0.837 | 0.186 | 0.009 | 0.067 | 0.180 | 0.319 | 0.316 | 0.472 | 0.601 | 0.266 | 0.310 | 0.225 | -0.168 | 0.009 | 0.157 | 0.395 | -0.031 | -0.168 | 0.513 | 0.222
category_group_encoded | -0.149 | 1.000 | 0.055 | 0.253 | 0.059 | 0.046 | 0.090 | 0.034 | 0.115 | 0.152 | 0.379 | 0.509 | 0.142 | 0.053 | -0.061 | 0.075 | 0.046 | 0.127 | 0.149 | -0.012 | 0.075 | 0.167 | 0.028
fake_review_label | 0.218 | 0.055 | 1.000 | 0.204 | 0.162 | 0.019 | 0.157 | 0.462 | 0.010 | 0.168 | 0.099 | 0.077 | 0.020 | 0.000 | 0.276 | 0.007 | 0.000 | 0.043 | 0.189 | 0.091 | 0.007 | 0.105 | 0.842
helpful_missing_flag | 0.837 | 0.253 | 0.204 | 1.000 | 0.188 | 0.243 | 0.028 | 0.045 | 0.244 | 0.300 | 0.417 | 0.308 | 0.138 | 0.136 | 0.394 | 0.026 | 0.054 | 0.111 | 0.525 | 0.083 | 0.026 | 0.465 | 0.154
is_short | 0.186 | 0.059 | 0.162 | 0.188 | 1.000 | 0.186 | 0.000 | 0.000 | 0.160 | 0.129 | 0.242 | 0.100 | 0.006 | 0.000 | 0.720 | 0.022 | 0.000 | 0.041 | 0.088 | 0.174 | 0.022 | 0.023 | 0.046
log_helpful | 0.009 | 0.046 | 0.019 | 0.243 | 0.186 | 1.000 | 0.000 | 0.000 | 1.000 | 0.038 | 0.177 | 0.187 | 0.038 | 0.009 | 0.011 | 0.006 | 1.000 | 0.034 | 0.037 | -0.034 | 0.006 | 0.019 | 0.000
multi_review_same_day_flag | 0.067 | 0.090 | 0.157 | 0.028 | 0.000 | 0.000 | 1.000 | 0.321 | 0.000 | 0.000 | 0.047 | 0.019 | 0.070 | 0.065 | 0.000 | 0.076 | 0.000 | 0.017 | 0.000 | 0.038 | 0.076 | 0.000 | 0.196
multi_review_same_product_flag | 0.180 | 0.034 | 0.462 | 0.045 | 0.000 | 0.000 | 0.321 | 1.000 | 0.003 | 0.013 | 0.052 | 0.053 | 0.037 | 0.036 | 0.000 | 0.000 | 0.000 | 0.020 | 0.015 | 0.041 | 0.000 | 0.002 | 0.611
no_helpful_votes_flag | 0.319 | 0.115 | 0.010 | 0.244 | 0.160 | 1.000 | 0.000 | 0.003 | 1.000 | 0.030 | 0.244 | 0.185 | 0.038 | 0.000 | 0.147 | 0.148 | 0.238 | 0.064 | 0.051 | 0.045 | 0.148 | 0.022 | 0.000
product_name_match_flag | 0.316 | 0.152 | 0.168 | 0.300 | 0.129 | 0.038 | 0.000 | 0.013 | 0.030 | 1.000 | 0.187 | 0.181 | 0.089 | 0.087 | 0.319 | 0.145 | 0.028 | 0.092 | 0.787 | 0.140 | 0.145 | 0.116 | 0.074
purchase_encoded | 0.472 | 0.379 | 0.099 | 0.417 | 0.242 | 0.177 | 0.047 | 0.052 | 0.244 | 0.187 | 1.000 | 1.000 | 0.107 | 0.084 | 0.226 | 0.097 | 0.057 | 0.169 | 0.195 | 0.084 | 0.097 | 0.179 | 0.083
purchase_missing_flag | 0.601 | 0.509 | 0.077 | 0.308 | 0.100 | 0.187 | 0.019 | 0.053 | 0.185 | 0.181 | 1.000 | 1.000 | 0.146 | 0.073 | 0.210 | 0.127 | 0.065 | 0.208 | 0.266 | 0.083 | 0.127 | 0.170 | 0.075
recommend_encoded | 0.266 | 0.142 | 0.020 | 0.138 | 0.006 | 0.038 | 0.070 | 0.037 | 0.038 | 0.089 | 0.107 | 0.146 | 1.000 | 1.000 | 0.066 | 0.131 | 0.000 | 0.543 | 0.053 | 0.159 | 0.131 | 0.080 | 0.026
recommend_missing_flag | 0.310 | 0.053 | 0.000 | 0.136 | 0.000 | 0.009 | 0.065 | 0.036 | 0.000 | 0.087 | 0.084 | 0.073 | 1.000 | 1.000 | 0.093 | 0.169 | 0.000 | 0.177 | 0.074 | 0.092 | 0.169 | 0.066 | 0.010
repetition_score | 0.225 | -0.061 | 0.276 | 0.394 | 0.720 | 0.011 | 0.000 | 0.000 | 0.147 | 0.319 | 0.226 | 0.210 | 0.066 | 0.093 | 1.000 | -0.741 | 0.011 | 0.042 | 0.221 | 0.216 | -0.741 | 0.148 | 0.096
review_length | -0.168 | 0.075 | 0.007 | 0.026 | 0.022 | 0.006 | 0.076 | 0.000 | 0.148 | 0.145 | 0.097 | 0.127 | 0.131 | 0.169 | -0.741 | 1.000 | 0.006 | 0.045 | 0.046 | -0.252 | 1.000 | 0.080 | 0.000
reviews.numHelpful | 0.009 | 0.046 | 0.000 | 0.054 | 0.000 | 1.000 | 0.000 | 0.000 | 0.238 | 0.028 | 0.057 | 0.065 | 0.000 | 0.000 | 0.011 | 0.006 | 1.000 | 0.000 | 0.012 | -0.034 | 0.006 | 0.000 | 0.000
reviews.rating | 0.157 | 0.127 | 0.043 | 0.111 | 0.041 | 0.034 | 0.017 | 0.020 | 0.064 | 0.092 | 0.169 | 0.208 | 0.543 | 0.177 | 0.042 | 0.045 | 0.000 | 1.000 | 0.046 | 0.123 | 0.045 | 0.057 | 0.047
semantic_mismatch_score | 0.395 | 0.149 | 0.189 | 0.525 | 0.088 | 0.037 | 0.000 | 0.015 | 0.051 | 0.787 | 0.195 | 0.266 | 0.053 | 0.074 | 0.221 | 0.046 | 0.012 | 0.046 | 1.000 | 0.085 | 0.046 | 0.771 | 0.095
sentiment_polarity | -0.031 | -0.012 | 0.091 | 0.083 | 0.174 | -0.034 | 0.038 | 0.041 | 0.045 | 0.140 | 0.084 | 0.083 | 0.159 | 0.092 | 0.216 | -0.252 | -0.034 | 0.123 | 0.085 | 1.000 | -0.252 | 0.050 | 0.024
text_length | -0.168 | 0.075 | 0.007 | 0.026 | 0.022 | 0.006 | 0.076 | 0.000 | 0.148 | 0.145 | 0.097 | 0.127 | 0.131 | 0.169 | -0.741 | 1.000 | 0.006 | 0.045 | 0.046 | -0.252 | 1.000 | 0.080 | 0.000
unrelated_product_flag | 0.513 | 0.167 | 0.105 | 0.465 | 0.023 | 0.019 | 0.000 | 0.002 | 0.022 | 0.116 | 0.179 | 0.170 | 0.080 | 0.066 | 0.148 | 0.080 | 0.000 | 0.057 | 0.771 | 0.050 | 0.080 | 1.000 | 0.064
username_dup_flag | 0.222 | 0.028 | 0.842 | 0.154 | 0.046 | 0.000 | 0.196 | 0.611 | 0.000 | 0.074 | 0.083 | 0.075 | 0.026 | 0.010 | 0.096 | 0.000 | 0.000 | 0.047 | 0.095 | 0.024 | 0.000 | 0.064 | 1.000
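
One way to read a matrix of this size is to filter it for strong pairs. The sketch below does that for a handful of coefficients copied from the rows above. The 0.5 cut-off is an assumption chosen for illustration, since the report does not state the threshold behind its High correlation alerts, and the function name high_pairs is hypothetical.

```python
# A few coefficients copied from the correlation matrix: (row, column) -> value.
corr = {
    ("fake_review_label", "username_dup_flag"): 0.842,
    ("fake_review_label", "multi_review_same_product_flag"): 0.462,
    ("is_short", "repetition_score"): 0.720,
    ("repetition_score", "review_length"): -0.741,
    ("review_length", "text_length"): 1.000,
    ("reviews.rating", "sentiment_polarity"): 0.123,
}

THRESHOLD = 0.5  # assumed cut-off, not stated in the report


def high_pairs(corr, threshold):
    """Return pairs whose absolute coefficient meets the threshold, strongest first."""
    selected = [(a, b, r) for (a, b), r in corr.items() if abs(r) >= threshold]
    return sorted(selected, key=lambda item: abs(item[2]), reverse=True)


for a, b, r in high_pairs(corr, THRESHOLD):
    print(f"{a} ~ {b}: {r:+.3f}")
```

A coefficient of 1.000 between review_length and text_length indicates that the two columns carry the same information, so one of them is likely redundant for modelling. The 0.842 between fake_review_label and username_dup_flag is worth checking for label leakage.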

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
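
The counts behind a nullity-by-column chart can be sketched in a few lines. The example below is a standard-library illustration on hypothetical toy rows, not the report's data, which has no missing cells. It treats both None and an absent key as missing, and the function name null_counts is hypothetical.

```python
def null_counts(rows, columns):
    """Count missing entries per column in a list of dict rows.

    A value counts as missing when it is None or the key is absent.
    """
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


# Hypothetical toy rows; the second one lacks a rating.
rows = [
    {"reviews.rating": 5, "text_length": 19},
    {"reviews.rating": None, "text_length": 6},
]
print(null_counts(rows, ["reviews.rating", "text_length"]))
# {'reviews.rating': 1, 'text_length': 0}
```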

Sample

First rows

index | id | dateAdded | dateUpdated | reviews.date | reviews.dateAdded | reviews.dateSeen | reviews.numHelpful | reviews.rating | purchase_missing_flag | purchase_encoded | recommend_missing_flag | recommend_encoded | helpful_missing_flag | no_helpful_votes_flag | log_helpful | text_length | is_short | sentiment_polarity | review_length | username_dup_flag | multi_review_same_day_flag | multi_review_same_product_flag | brand_encoded | category_group_encoded | product_name_match_flag | unrelated_product_flag | semantic_mismatch_score | repetition_score | fake_review_label
0 | AV13O1A8GV-KLJ3akUyj | 2017-07-25 00:52:42+00:00 | 2018-02-05 08:36:58+00:00 | 2012-11-30 06:21:45+00:00 | 2018-02-04 07:28:12+00:00 | 2018-01-15 04:45:00+00:00 | 0.0 | 5 | 1 | 0 | 1 | 0 | 0 | 1 | 0.0 | 19 | False | 0.133333 | 19 | 1 | 0 | 0 | 65 | 1 | 0 | 0 | 1 | 0.052632 | 1
1 | AV14LG0R-jtxr-f38QfS | 2017-07-25 05:16:03+00:00 | 2018-02-05 11:27:45+00:00 | 2017-07-09 00:00:00+00:00 | 2017-09-23 02:53:06+00:00 | 2017-09-16 09:45:00+00:00 | 0.0 | 5 | 0 | 2 | 1 | 0 | 1 | 1 | 0.0 | 6 | False | 0.700000 | 6 | 1 | 1 | 1 | 33 | 2 | 0 | 0 | 1 | 0.166667 | 1
2 | AV14LG0R-jtxr-f38QfS | 2017-07-25 05:16:03+00:00 | 2018-02-05 11:27:45+00:00 | 2017-07-09 00:00:00+00:00 | 2017-09-06 04:49:31+00:00 | 2017-08-23 10:37:00+00:00 | 0.0 | 5 | 0 | 2 | 1 | 0 | 1 | 1 | 0.0 | 2 | True | 0.700000 | 2 | 1 | 1 | 1 | 33 | 2 | 0 | 0 | 1 | 0.500000 | 1
3 | AV16khLE-jtxr-f38VFn | 2017-07-25 16:26:19+00:00 | 2018-02-05 11:25:51+00:00 | 2016-01-06 00:00:00+00:00 | 2017-09-11 17:13:57+00:00 | 2017-09-04 12:18:00+00:00 | 0.0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0.0 | 55 | False | -0.007175 | 55 | 1 | 0 | 0 | 30 | 0 | 0 | 1 | 2 | 0.036364 | 1
4 | AV16khLE-jtxr-f38VFn | 2017-07-25 16:26:19+00:00 | 2018-02-05 11:25:51+00:00 | 2016-12-21 00:00:00+00:00 | 2017-09-11 17:13:57+00:00 | 2017-09-04 12:18:00+00:00 | 0.0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0.0 | 14 | False | 0.000000 | 14 | 0 | 0 | 0 | 30 | 0 | 1 | 0 | 0 | 0.214286 | 0
5 | AV16khLE-jtxr-f38VFn | 2017-07-25 16:26:19+00:00 | 2018-02-05 11:25:51+00:00 | 2016-04-20 00:00:00+00:00 | 2017-09-11 17:13:57+00:00 | 2017-09-04 12:18:00+00:00 | 0.0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0.0 | 21 | False | -0.012500 | 21 | 1 | 0 | 0 | 30 | 0 | 1 | 1 | 1 | 0.095238 | 1
6 | AV16khLE-jtxr-f38VFn | 2017-07-25 16:26:19+00:00 | 2018-02-05 11:25:51+00:00 | 2016-02-08 00:00:00+00:00 | 2017-09-11 17:13:57+00:00 | 2017-09-04 12:18:00+00:00 | 0.0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0.0 | 18 | False | -0.094643 | 18 | 0 | 0 | 0 | 30 | 0 | 0 | 0 | 1 | 0.055556 | 0
7 | AV16khLE-jtxr-f38VFn | 2017-07-25 16:26:19+00:00 | 2018-02-05 11:25:51+00:00 | 2016-02-21 00:00:00+00:00 | 2017-09-11 17:13:57+00:00 | 2017-09-04 12:18:00+00:00 | 0.0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0.0 | 18 | False | 0.170000 | 18 | 0 | 0 | 0 | 30 | 0 | 0 | 0 | 1 | 0.111111 | 0
8 | AV16khLE-jtxr-f38VFn | 2017-07-25 16:26:19+00:00 | 2018-02-05 11:25:51+00:00 | 2016-03-28 00:00:00+00:00 | 2017-09-11 17:13:57+00:00 | 2017-09-04 12:18:00+00:00 | 0.0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0.0 | 16 | False | -0.137500 | 16 | 0 | 0 | 0 | 30 | 0 | 0 | 0 | 1 | 0.062500 | 0
9 | AV16khLE-jtxr-f38VFn | 2017-07-25 16:26:19+00:00 | 2018-02-05 11:25:51+00:00 | 2016-03-21 00:00:00+00:00 | 2017-09-11 17:13:57+00:00 | 2017-09-04 12:18:00+00:00 | 0.0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0.0 | 17 | False | 0.071429 | 17 | 0 | 0 | 0 | 30 | 0 | 0 | 0 | 1 | 0.117647 | 0

Last rows

index | id | dateAdded | dateUpdated | reviews.date | reviews.dateAdded | reviews.dateSeen | reviews.numHelpful | reviews.rating | purchase_missing_flag | purchase_encoded | recommend_missing_flag | recommend_encoded | helpful_missing_flag | no_helpful_votes_flag | log_helpful | text_length | is_short | sentiment_polarity | review_length | username_dup_flag | multi_review_same_day_flag | multi_review_same_product_flag | brand_encoded | category_group_encoded | product_name_match_flag | unrelated_product_flag | semantic_mismatch_score | repetition_score | fake_review_label
9470 | AVpf3VOfilAPnD_xjpun | 2015-09-11 18:17:13+00:00 | 2018-02-05 08:35:02+00:00 | 2014-12-22 00:00:00+00:00 | 2017-08-24 00:14:12+00:00 | 2017-08-16 12:56:00+00:00 | 0.0 | 5 | 0 | 1 | 0 | 2 | 1 | 1 | 0.0 | 16 | False | 0.500000 | 16 | 0 | 0 | 0 | 13 | 7 | 1 | 1 | 1 | 0.062500 | 0
9471 | AVpf3VOfilAPnD_xjpun | 2015-09-11 18:17:13+00:00 | 2018-02-05 08:35:02+00:00 | 2014-12-22 00:00:00+00:00 | 2017-08-24 00:14:12+00:00 | 2017-08-16 12:56:00+00:00 | 0.0 | 5 | 0 | 1 | 0 | 2 | 1 | 1 | 0.0 | 16 | False | 0.388889 | 16 | 0 | 0 | 0 | 13 | 7 | 1 | 0 | 0 | 0.125000 | 0
9472 | AVpf3VOfilAPnD_xjpun | 2015-09-11 18:17:13+00:00 | 2018-02-05 08:35:02+00:00 | 2012-01-26 00:00:00+00:00 | 2017-08-24 00:14:13+00:00 | 2017-08-16 12:56:00+00:00 | 0.0 | 5 | 0 | 1 | 0 | 2 | 1 | 1 | 0.0 | 11 | False | 0.400000 | 11 | 0 | 0 | 0 | 13 | 7 | 0 | 0 | 1 | 0.090909 | 0
9473 | AVpf3VOfilAPnD_xjpun | 2015-09-11 18:17:13+00:00 | 2018-02-05 08:35:02+00:00 | 2012-02-11 00:00:00+00:00 | 2017-08-24 00:14:13+00:00 | 2017-08-16 12:56:00+00:00 | 0.0 | 5 | 0 | 1 | 0 | 2 | 1 | 1 | 0.0 | 12 | False | 0.050000 | 12 | 0 | 0 | 0 | 13 | 7 | 0 | 0 | 1 | 0.166667 | 0
9474 | AVpf3VOfilAPnD_xjpun | 2015-09-11 18:17:13+00:00 | 2018-02-05 08:35:02+00:00 | 2014-12-03 00:00:00+00:00 | 2017-08-24 00:14:13+00:00 | 2017-08-16 12:56:00+00:00 | 0.0 | 5 | 0 | 1 | 0 | 2 | 1 | 1 | 0.0 | 18 | False | 0.366667 | 18 | 0 | 0 | 0 | 13 | 7 | 1 | 0 | 0 | 0.055556 | 0
9475 | AVpf3VOfilAPnD_xjpun | 2015-09-11 18:17:13+00:00 | 2018-02-05 08:35:02+00:00 | 2014-12-30 00:00:00+00:00 | 2017-08-24 00:14:13+00:00 | 2017-08-16 12:56:00+00:00 | 0.0 | 5 | 0 | 1 | 0 | 2 | 1 | 1 | 0.0 | 13 | False | 0.400000 | 13 | 0 | 0 | 0 | 13 | 7 | 0 | 0 | 1 | 0.076923 | 0
9476 | AVpf3VOfilAPnD_xjpun | 2015-09-11 18:17:13+00:00 | 2018-02-05 08:35:02+00:00 | 2014-12-27 00:00:00+00:00 | 2017-08-24 00:14:13+00:00 | 2017-08-16 12:56:00+00:00 | 0.0 | 5 | 0 | 1 | 0 | 2 | 1 | 1 | 0.0 | 17 | False | 0.133333 | 17 | 0 | 0 | 0 | 13 | 7 | 1 | 0 | 0 | 0.058824 | 0
9477 | AVpf3VOfilAPnD_xjpun | 2015-09-11 18:17:13+00:00 | 2018-02-05 08:35:02+00:00 | 2014-12-06 00:00:00+00:00 | 2017-08-24 00:14:13+00:00 | 2017-08-16 12:56:00+00:00 | 0.0 | 5 | 0 | 1 | 0 | 2 | 1 | 1 | 0.0 | 13 | False | 0.266667 | 13 | 0 | 0 | 0 | 13 | 7 | 1 | 0 | 0 | 0.076923 | 0
9478 | AVpf3VOfilAPnD_xjpun | 2015-09-11 18:17:13+00:00 | 2018-02-05 08:35:02+00:00 | 2015-01-06 00:00:00+00:00 | 2017-08-24 00:14:13+00:00 | 2017-08-16 12:56:00+00:00 | 0.0 | 5 | 0 | 1 | 0 | 2 | 1 | 1 | 0.0 | 16 | False | 0.500000 | 16 | 0 | 0 | 0 | 13 | 7 | 1 | 0 | 0 | 0.062500 | 0
9479 | AVpf3VOfilAPnD_xjpun | 2015-09-11 18:17:13+00:00 | 2018-02-05 08:35:02+00:00 | 2015-01-17 00:00:00+00:00 | 2017-08-24 00:14:13+00:00 | 2017-08-16 12:56:00+00:00 | 0.0 | 5 | 0 | 1 | 0 | 2 | 1 | 1 | 0.0 | 14 | False | 0.000000 | 14 | 0 | 0 | 0 | 13 | 7 | 0 | 0 | 1 | 0.142857 | 0

Duplicate rows

Most frequently occurring

index | id | dateAdded | dateUpdated | reviews.date | reviews.dateAdded | reviews.dateSeen | reviews.numHelpful | reviews.rating | purchase_missing_flag | purchase_encoded | recommend_missing_flag | recommend_encoded | helpful_missing_flag | no_helpful_votes_flag | log_helpful | text_length | is_short | sentiment_polarity | review_length | username_dup_flag | multi_review_same_day_flag | multi_review_same_product_flag | brand_encoded | category_group_encoded | product_name_match_flag | unrelated_product_flag | semantic_mismatch_score | repetition_score | fake_review_label | # duplicates
0 | AV1YGDqsGV-KLJ3adc-O | 2017-07-18 23:46:09+00:00 | 2018-02-05 08:34:58+00:00 | 2014-10-30 00:00:00+00:00 | 2017-08-05 07:43:50+00:00 | 2017-07-19 23:58:00+00:00 | 0.0 | 5 | 0 | 1 | 0 | 2 | 1 | 1 | 0.0 | 9 | False | 0.650000 | 9 | 0 | 0 | 0 | 69 | 0 | 0 | 0 | 1 | 0.111111 | 0 | 2
1 | AV1YGDqsGV-KLJ3adc-O | 2017-07-18 23:46:09+00:00 | 2018-02-05 08:34:58+00:00 | 2014-10-30 00:00:00+00:00 | 2017-09-25 16:58:52+00:00 | 2017-09-18 03:50:00+00:00 | 0.0 | 5 | 0 | 1 | 0 | 2 | 1 | 1 | 0.0 | 9 | False | 0.000000 | 9 | 0 | 0 | 0 | 69 | 0 | 0 | 0 | 1 | 0.111111 | 0 | 2
2 | AV1YGDqsGV-KLJ3adc-O | 2017-07-18 23:46:09+00:00 | 2018-02-05 08:34:58+00:00 | 2014-10-30 00:00:00+00:00 | 2017-09-25 16:58:52+00:00 | 2017-09-18 03:50:00+00:00 | 0.0 | 5 | 0 | 1 | 0 | 2 | 1 | 1 | 0.0 | 12 | False | 0.366667 | 12 | 0 | 0 | 0 | 69 | 0 | 0 | 0 | 1 | 0.083333 | 0 | 2
3 | AV1YmDL9vKc47QAVgr7_ | 2017-07-19 02:05:55+00:00 | 2018-02-05 11:27:11+00:00 | 2016-10-05 00:00:00+00:00 | 2017-09-11 02:56:19+00:00 | 2017-09-04 04:19:00+00:00 | 0.0 | 5 | 0 | 1 | 0 | 2 | 1 | 1 | 0.0 | 14 | False | 0.465000 | 14 | 0 | 0 | 0 | 2 | 3 | 1 | 0 | 0 | 0.071429 | 0 | 2
4 | AV1l8zRZvKc47QAVhnAv | 2017-07-21 16:20:23+00:00 | 2018-02-05 11:28:34+00:00 | 2015-05-26 00:00:00+00:00 | 2017-09-24 08:19:42+00:00 | 2017-09-03 19:47:00+00:00 | 0.0 | 5 | 0 | 1 | 0 | 2 | 1 | 1 | 0.0 | 11 | False | 0.000000 | 11 | 0 | 0 | 0 | 41 | 8 | 0 | 0 | 1 | 0.090909 | 0 | 2
5 | AV1l8zRZvKc47QAVhnAv | 2017-07-21 16:20:23+00:00 | 2018-02-05 11:28:34+00:00 | 2015-06-01 00:00:00+00:00 | 2017-09-24 08:19:42+00:00 | 2017-09-03 19:47:00+00:00 | 0.0 | 5 | 0 | 1 | 0 | 2 | 1 | 1 | 0.0 | 15 | False | 0.250000 | 15 | 0 | 0 | 0 | 41 | 8 | 0 | 0 | 1 | 0.066667 | 0 | 2
6 | AVpe41TqilAPnD_xQH3d | 2017-01-15 18:09:31+00:00 | 2018-02-05 08:36:37+00:00 | 2016-12-14 00:00:00+00:00 | 2017-09-24 02:17:19+00:00 | 2017-09-21 07:50:00+00:00 | 0.0 | 5 | 1 | 0 | 0 | 2 | 0 | 1 | 0.0 | 6 | False | 0.000000 | 6 | 0 | 0 | 0 | 17 | 7 | 1 | 1 | 1 | 0.166667 | 0 | 2
7 | AVpe6FpaLJeJML43yBuP | 2015-11-13 04:18:48+00:00 | 2018-02-05 08:36:40+00:00 | 2014-09-25 00:00:00+00:00 | 2017-09-27 10:39:26+00:00 | 2017-09-08 02:54:00+00:00 | 0.0 | 5 | 1 | 0 | 0 | 2 | 0 | 1 | 0.0 | 7 | False | 0.700000 | 7 | 1 | 1 | 1 | 67 | 7 | 0 | 0 | 1 | 0.142857 | 1 | 2
8 | AVpf0eb2LJeJML43EVSt | 2015-11-15 09:05:44+00:00 | 2018-02-05 08:37:24+00:00 | 2015-02-21 00:00:00+00:00 | 2017-09-20 23:03:37+00:00 | 2017-09-02 06:16:00+00:00 | 0.0 | 5 | 1 | 0 | 0 | 2 | 0 | 1 | 0.0 | 8 | False | -0.250000 | 8 | 0 | 0 | 0 | 52 | 7 | 1 | 1 | 1 | 0.125000 | 0 | 2
9 | AVpf0eb2LJeJML43EVSt | 2015-11-15 09:05:44+00:00 | 2018-02-05 08:37:24+00:00 | 2017-01-20 00:00:00+00:00 | 2017-09-20 23:03:37+00:00 | 2017-09-02 06:15:00+00:00 | 0.0 | 5 | 1 | 0 | 0 | 2 | 0 | 1 | 0.0 | 7 | False | 0.000000 | 7 | 0 | 0 | 0 | 52 | 7 | 0 | 1 | 2 | 0.142857 | 0 | 2
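
Exact duplicate rows such as those listed above can be found by counting whole rows. The sketch below uses the standard library's Counter on hypothetical toy tuples, not the report's data, and the function name duplicate_rows is hypothetical.

```python
from collections import Counter


def duplicate_rows(rows):
    """Return each row that occurs more than once, with its occurrence count."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical toy rows: (id, text_length, fake_review_label).
rows = [
    ("A1", 9, 0),
    ("A1", 9, 0),
    ("A1", 12, 0),
    ("B2", 7, 1),
]
print(duplicate_rows(rows))  # {('A1', 9, 0): 2}
```

Rows must match on every column to count as duplicates here, so two reviews that share an id but differ in any other field are kept apart.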